In collaboration with Tim Bendel from Bank of America, the Trainsformers team is focusing on strengthening U.S. railroads against climate-related disruptions. This project employs advanced data analytics and geospatial technologies to assess how rising temperatures and extreme weather can damage rail infrastructure and disrupt services.
While climate catastrophes often damage regions directly in their path, their effects ripple far beyond through the interconnected webs of infrastructure. Understanding these secondary impacts is vital, as they can exceed the initial economic damages. By integrating comprehensive datasets from the National Oceanic and Atmospheric Administration (NOAA) and the U.S. Department of Transportation (USDOT), we will identify and map these indirect risks to the railway systems that play an integral role in the nation’s economic engine.
Our objective is to leverage data engineering and geospatial analytics to develop a predictive model that quantifies the vulnerability of U.S. rail networks to future climate-related disruptions. The insights gained will not only spotlight regions of indirect vulnerability but also guide infrastructural fortification efforts, ensuring that rail lines remain resilient in the face of climate uncertainty. With parallels to research published in the field, our project will navigate the complex landscape of environmental data, rail infrastructure, and economic implications.
The map below displays an extensive visualization of all rail network lines across the continental United States.
BEGIN_YEARMONTH: The year and month when the climate event began, formatted as a six-digit number (YYYYMM).
BEGIN_DAY: The day of the month when the climate event started, represented by a two-digit number.
END_YEARMONTH: The year and month when the climate event ended, formatted as a six-digit number (YYYYMM).
END_DAY: The day of the month on which the climate event concluded, represented by a two-digit number.
EVENT_TYPE: The type of Climate Event being recorded.
CZ_FIPS: The FIPS code of the county or forecast zone (CZ) where the event occurred.
DAMAGE_PROPERTY: The total amount of damage in dollars as a result of an event.
BEGIN_LAT: The specific Latitude where the event took place.
BEGIN_LON: The specific Longitude where the event took place.
Objectid: Node id; unique id allocated per node
Frfranode: an identifier for the node where the rail network line starts
Tofranode: an identifier for the node where the rail network line ends
stateab: The state abbreviation
country: The country abbreviation
division: Section of the US (e.g., Mid America)
timezone: an identifier of timezone
shape_Length: represents the length of each segment of the rail line
miles: The length of the segments in miles
| BEGIN_YEARMONTH | BEGIN_DAY | BEGIN_TIME | END_YEARMONTH | END_DAY | END_TIME | EPISODE_ID | EVENT_ID | STATE | STATE_FIPS | YEAR | MONTH_NAME | EVENT_TYPE | CZ_TYPE | CZ_FIPS | CZ_NAME | WFO | BEGIN_DATE_TIME | CZ_TIMEZONE | END_DATE_TIME | INJURIES_DIRECT | INJURIES_INDIRECT | DEATHS_DIRECT | DEATHS_INDIRECT | DAMAGE_PROPERTY | DAMAGE_CROPS | SOURCE | MAGNITUDE | MAGNITUDE_TYPE | FLOOD_CAUSE | CATEGORY | TOR_F_SCALE | TOR_LENGTH | TOR_WIDTH | TOR_OTHER_WFO | TOR_OTHER_CZ_STATE | TOR_OTHER_CZ_FIPS | TOR_OTHER_CZ_NAME | BEGIN_RANGE | BEGIN_AZIMUTH | BEGIN_LOCATION | END_RANGE | END_AZIMUTH | END_LOCATION | BEGIN_LAT | BEGIN_LON | END_LAT | END_LON | EPISODE_NARRATIVE | EVENT_NARRATIVE | DATA_SOURCE | YearMonth | Total_Damage | DAMAGE_PROPERTY_NUM | DAMAGE_CROPS_NUM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 195004 | 28 | 1445 | 195004 | 28 | 1445 | NA | 10096222 | OKLAHOMA | 40 | 1950 | April | Tornado | C | 149 | WASHITA | NA | 4/28/1950 14:45 | CST | 4/28/1950 14:45 | 0 | 0 | 0 | 0 | 250 | 0 | NA | 0 | NA | NA | NA | F3 | 3.4 | 400 | NA | NA | NA | NA | 0 | NA | NA | 0 | NA | NA | 35.12 | -99.2 | 35.17 | -99.2 | NA | NA | PUB | 195004 | 250 | 25 | 0 |
| 195004 | 29 | 1530 | 195004 | 29 | 1530 | NA | 10120412 | TEXAS | 48 | 1950 | April | Tornado | C | 93 | COMANCHE | NA | 4/29/1950 15:30 | CST | 4/29/1950 15:30 | 0 | 0 | 0 | 0 | 25 | 0 | NA | 0 | NA | NA | NA | F1 | 11.5 | 200 | NA | NA | NA | NA | 0 | NA | NA | 0 | NA | NA | 31.9 | -98.6 | 31.73 | -98.6 | NA | NA | PUB | 195004 | 25 | 2 | 0 |
| 195007 | 5 | 1800 | 195007 | 5 | 1800 | NA | 10104927 | PENNSYLVANIA | 42 | 1950 | July | Tornado | C | 77 | LEHIGH | NA | 7/5/1950 18:00 | CST | 7/5/1950 18:00 | 2 | 0 | 0 | 0 | 25 | 0 | NA | 0 | NA | NA | NA | F2 | 12.9 | 33 | NA | NA | NA | NA | 0 | NA | NA | 0 | NA | NA | 40.58 | -75.7 | 40.65 | -75.47 | NA | NA | PUB | 195007 | 25 | 2 | 0 |
| 195007 | 5 | 1830 | 195007 | 5 | 1830 | NA | 10104928 | PENNSYLVANIA | 42 | 1950 | July | Tornado | C | 43 | DAUPHIN | NA | 7/5/1950 18:30 | CST | 7/5/1950 18:30 | 0 | 0 | 0 | 0 | 2.5 | 0 | NA | 0 | NA | NA | NA | F2 | 0 | 13 | NA | NA | NA | NA | 0 | NA | NA | 0 | NA | NA | 40.6 | -76.75 | NA | NA | NA | NA | PUB | 195007 | 2.5 | 2 | 0 |
| 195007 | 24 | 1440 | 195007 | 24 | 1440 | NA | 10104929 | PENNSYLVANIA | 42 | 1950 | July | Tornado | C | 39 | CRAWFORD | NA | 7/24/1950 14:40 | CST | 7/24/1950 14:40 | 0 | 0 | 0 | 0 | 2.5 | 0 | NA | 0 | NA | NA | NA | F0 | 0 | 33 | NA | NA | NA | NA | 0 | NA | NA | 0 | NA | NA | 41.63 | -79.68 | NA | NA | NA | NA | PUB | 195007 | 2.5 | 2 | 0 |
| 195008 | 29 | 1600 | 195008 | 29 | 1600 | NA | 10104930 | PENNSYLVANIA | 42 | 1950 | August | Tornado | C | 17 | BUCKS | NA | 8/29/1950 16:00 | CST | 8/29/1950 16:00 | 0 | 0 | 0 | 0 | 2.5 | 0 | NA | 0 | NA | NA | NA | F1 | 1 | 33 | NA | NA | NA | NA | 0 | NA | NA | 0 | NA | NA | 40.22 | -75 | NA | NA | NA | NA | PUB | 195008 | 2.5 | 2 | 0 |
| objectid | fraarcid | frfranode | tofranode | stfips | cntyfips | stcntyfips | stateab | country | fradistrct | rrowner1 | rrowner2 | rrowner3 | trkrghts1 | trkrghts2 | trkrghts3 | trkrghts4 | trkrghts5 | trkrghts6 | trkrghts7 | trkrghts8 | trkrghts9 | division | subdiv | branch | yardname | passngr | stracnet | tracks | net | miles | km | timezone | shape_Length |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 300000 | 348741 | 348746 | 38 | 15 | 38015 | ND | US | 8 | DMVW | | | | | | | | | | | | WESTERN | EIGHTH | XLINE | | | | 1 | M | 0.1781007 | 0.2866258 | C | 0.0031940 |
| 2 | 300001 | 338567 | 338686 | 30 | 87 | 30087 | MT | US | 8 | BNSF | | | | | | | | | | | | | | | | | | 0 | O | 0.8865854 | 1.4268238 | M | 0.0172269 |
| 3 | 300002 | 330112 | 330117 | 16 | 31 | 16031 | ID | US | 8 | EIRR | | | | | | | | | | | | SOUTHERN | TWIN FALLS | TWIN FALLS | | | | 1 | M | 0.2218197 | 0.3569849 | M | 0.0042691 |
| 4 | 300003 | 330113 | 330116 | 16 | 31 | 16031 | ID | US | 8 | EIRR | | | | | | | | | | | | SOUTHERN | RAFT RIVER INDUSTRIAL SPUR | RAFT RIVER IL | | | | 1 | I | 0.1275709 | 0.2053059 | M | 0.0024842 |
| 7 | 300006 | 312341 | 312373 | 41 | 35 | 41035 | OR | US | 8 | UP | | | | | | | | | | | | PACIFIC NORTHWEST | MODOC | #N | | | | 1 | M | 0.7004531 | 1.1272723 | P | 0.0136099 |
| 9 | 300008 | 328030 | 328032 | 16 | 39 | 16039 | ID | US | 8 | USG | | | | | | | | | | | | | | | | | | 0 | O | 0.3128218 | 0.5034390 | M | 0.0054321 |
The first task in building the three-phase probabilistic model was to transform the raw, unstructured 72-year climate event data into a format suitable for comprehensive analysis. To achieve this, we applied a two-part aggregation strategy to the Complete.data dataset, using both location and time as parameters. Geographic data was aggregated at the state level, which provided a broad perspective on the distribution of climate events. Temporal data was consolidated by month, allowing us to discern patterns and trends over time. In parallel, we counted how often climate events occurred and converted reported damages into a consistent numeric format. This Phase 1 conversion was critical for accurate quantitative assessment in the later phases.
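The aggregation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's actual pipeline: the function names `parse_damage` and `aggregate_by_state_month` are hypothetical, the sample rows are made up, and the sketch assumes that raw damage values carry the suffixes K, M, or B for thousands, millions, and billions of dollars.

```python
from collections import defaultdict

# Assumed suffix convention for raw damage strings: K = thousand,
# M = million, B = billion. Adjust if the raw file differs.
_MULTIPLIERS = {"K": 1e3, "M": 1e6, "B": 1e9}


def parse_damage(value):
    """Convert a damage string such as '2.5K' to dollars; blanks and NA become 0."""
    text = str(value).strip().upper()
    if text in ("", "NA", "NAN", "NONE"):
        return 0.0
    if text[-1] in _MULTIPLIERS:
        number = text[:-1] or "0"  # a bare suffix such as 'K' counts as zero
        return float(number) * _MULTIPLIERS[text[-1]]
    return float(text)


def aggregate_by_state_month(events):
    """Sum event counts and property damage per (state, YYYYMM) key."""
    totals = defaultdict(lambda: {"events": 0, "damage": 0.0})
    for event in events:
        key = (event["STATE"], int(event["BEGIN_YEARMONTH"]))
        totals[key]["events"] += 1
        totals[key]["damage"] += parse_damage(event["DAMAGE_PROPERTY"])
    return dict(totals)


# Illustrative rows only; the damage strings are invented for this sketch.
sample = [
    {"STATE": "PENNSYLVANIA", "BEGIN_YEARMONTH": 195007, "DAMAGE_PROPERTY": "25K"},
    {"STATE": "PENNSYLVANIA", "BEGIN_YEARMONTH": 195007, "DAMAGE_PROPERTY": "2.5K"},
    {"STATE": "TEXAS", "BEGIN_YEARMONTH": 195004, "DAMAGE_PROPERTY": "25K"},
]
monthly = aggregate_by_state_month(sample)
```

Keying the totals on a (state, month) pair keeps the later phases simple, because every downstream step can look up one record per location and period.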
After the dataset was aggregated, the next phase involved constructing our dependent variable. This variable indicates whether a weather event occurs within a two-year forecast horizon for a given geographical unit and time frame, and it is what the model is trained to predict. To implement this, we appended a new column, labeled “y”, to our dataset.
Before adding this column “y”, it was imperative to generate a comprehensive matrix of all the possible geographic and temporal combinations. This exhaustive array encompassed every possible combination, even those not represented in the original dataset. For instances absent from the raw data, we assigned a default value of zero for both the event occurrences and associated damages. This preemptive measure was essential to ensure that our model accounts for periods and locations where no events were recorded.
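A minimal sketch of such a zero-filled matrix follows, assuming the aggregated data is held as a dictionary keyed by (state, YYYYMM). The helper names `month_range` and `build_full_grid` and the sample values are hypothetical.

```python
from itertools import product


def month_range(start, end):
    """Yield YYYYMM integers from start to end, inclusive."""
    year, month = divmod(start, 100)
    while year * 100 + month <= end:
        yield year * 100 + month
        month += 1
        if month > 12:
            year, month = year + 1, 1


def build_full_grid(observed, states, start, end):
    """Return {(state, YYYYMM): record} for every combination, zero-filled."""
    empty = {"events": 0, "damage": 0.0}
    return {
        (state, ym): dict(observed.get((state, ym), empty))
        for state, ym in product(states, month_range(start, end))
    }


# Illustrative input: one observed state-month, two states, three months.
observed = {("TEXAS", 195004): {"events": 1, "damage": 25000.0}}
grid = build_full_grid(observed, ["OKLAHOMA", "TEXAS"], 195003, 195005)
```

Here `grid` holds six records: the one observed Texas month keeps its values, and the other five combinations receive zero events and zero damage.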
Once the matrix was established, we derived the dependent variable by surveying each record. We marked a “1” in column “y” if any weather events were recorded in the two-year period following the record’s date; conversely, a “0” denoted the absence of events. Where the required two years of future data were unavailable, the entry was left as Null or N/A. Two critical considerations accompanied this process:
Records containing Null or N/A in the dependent variable column were excluded from the dataset before the start of the model training to ensure the integrity and applicability of our predictive analytics.
If the Phase 2 data aggregated at the state level yielded a uniform value of “1” across the dependent variable, indicating that the aggregation was too coarse, a transition to a finer geographical granularity would be necessary. This entails adopting smaller spatial units such as ZIP codes, hexbin locations, or a latitude-longitude grid to achieve a more discerning and useful predictive model.
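The labeling rule and the two considerations above can be sketched as follows. This is an illustration under stated assumptions: the functions `label_future_events` and `is_degenerate` are hypothetical names, the input is one location's monthly event counts in chronological order, and the look-ahead window is assumed to start in the month after the record (24 months for a two-year horizon).

```python
def label_future_events(counts, horizon=24):
    """Label each month 1/0 by whether any event occurs in the next `horizon` months.

    Months without a full `horizon` of future data are labeled None.
    """
    labels = []
    for i in range(len(counts)):
        window = counts[i + 1 : i + 1 + horizon]
        if len(window) < horizon:
            labels.append(None)  # not enough future data
        else:
            labels.append(1 if any(c > 0 for c in window) else 0)
    return labels


def is_degenerate(labels):
    """True when every known label is 1, i.e. the spatial unit is too coarse."""
    known = [y for y in labels if y is not None]
    return bool(known) and all(y == 1 for y in known)


# Small worked example with a 3-month horizon instead of 24.
counts = [0, 0, 0, 0, 2, 0, 0, 0]
labels = label_future_events(counts, horizon=3)
training_labels = [y for y in labels if y is not None]  # rows with None are dropped
```

With this input, `labels` is `[0, 1, 1, 1, 0, None, None, None]`: the last three months lack a full look-ahead window and are removed before training, and `is_degenerate(labels)` is false because the series contains both classes.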
To improve the predictive capability of our model and add analytical depth to our dataset, we incorporated a set of engineered features based on historical data as part of Phase 3. These features are rolling lookback metrics that reflect the frequency and severity of past weather events: variables that quantify the cumulative number of events and the aggregate damages over preceding time intervals. For a comprehensive temporal analysis, these intervals are categorized into short-, medium-, and long-term periods, each providing a distinct window into the historical pattern of weather impacts. This approach yields variables such as “number of events in the previous three time periods” and “total damage in the previous three time periods.” By adjusting the length of these lookback windows, we can capture both the immediate and the extended impacts of climate events, which improves the model’s representation of recent and past conditions.
Our finalized dataset will, therefore, be a matrix that combines these lookback variables with the original data. This provides a solid foundation for an accurate and reliable predictive model.
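As a rough sketch of the lookback variables, the snippet below sums the values that fall strictly before each period. The function names, the feature names, and the window lengths of 3, 12, and 60 periods are illustrative assumptions, not the windows used in the project.

```python
def lookback_sum(values, window):
    """Sum of up to `window` values strictly before each position."""
    return [sum(values[max(0, i - window):i]) for i in range(len(values))]


def add_lookback_features(events, damages, windows=(3, 12, 60)):
    """Build short-, medium-, and long-term lookback columns for one location."""
    features = {}
    for w in windows:
        features[f"events_prev_{w}"] = lookback_sum(events, w)
        features[f"damage_prev_{w}"] = lookback_sum(damages, w)
    return features


# Illustrative monthly series for a single location.
events = [1, 2, 3, 4, 5]
damages = [10.0, 0.0, 5.0, 0.0, 20.0]
features = add_lookback_features(events, damages, windows=(3,))
```

Excluding the current period from each window keeps the features strictly historical, so a row never sees the events it is paired with.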
| Location | Event |
|---|---|
| Hawaii | Astronomical Low Tide |
| Alaska | Coastal Flood |
| Guam | Dense Smoke |
| Puerto Rico | Drought |
| Gulf of Mexico | Dust Devil |
| Atlantic North | Dust Storm |
| Atlantic South | Freezing Fog |
| Hawaii waters | Funnel Cloud |
| E Pacific | Heat |
| Virgin Islands | High Surf |
| American Samoa | Lake-Effect Snow |
| Lake Superior | Lakeshore Flood |
| Lake St Clair | Marine Hail |
| Lake Ontario | Marine High Wind |
| Lake Erie | Marine Strong Wind |
| Lake Michigan | Marine Thunderstorm Wind |
| Lake Huron | Rip Current |
| | Waterspout |
| | Seiche |
| | Sleet |
| | Sneakerwave |
| | Tsunami |
| | Tropical Depression |
| | Tropical Storm |
| | Volcanic Ashfall |
The bar graph succinctly prioritizes states based on the financial impact of weather events. Texas emerges as the outlier with the highest damages, suggesting a critical review of its disaster response and infrastructure resilience is warranted. The presence of Midwestern states like Iowa and Nebraska highlights the significant toll of storms in the nation’s agricultural heartland, raising questions about the interplay between climate events and economic vulnerabilities tied to agriculture and land use. Interestingly, coastal and southern states, typically the focus of hurricane-related damages, are interspersed among less frequently discussed states like Illinois and Wisconsin. This distribution prompts us to consider a broader range of climate impacts beyond the obvious high-risk areas. The chart’s data can inform not just reactive policies but proactive investments in technology and infrastructure to fortify states against predictable damages. It also hints at the potential benefits of cross-state learning, where lower-impact states could share best practices in climate event mitigation. This analysis underscores the importance of a strategic, data-informed approach to climate resilience, where the allocation of resources is as dynamic and varied as the weather patterns themselves.
This normalized bar chart provides an adjusted view of the financial impact of weather events, accounting for the size of each state. This normalization allows for a more equitable comparison by illustrating the damage relative to the geographic area, rather than total damages which can be skewed by the size and economic activity of a state. From the chart, Iowa stands out with the highest damage per square mile, indicating that, when size is considered, its relative economic impact from weather events is the most substantial. This could reflect a high density of valuable assets or infrastructure within a smaller area, or an exceptional severity of weather events. States like Ohio, Mississippi, and New Jersey follow, suggesting these states, though varying in size and geography, face significant impacts from weather events relative to their area. Interestingly, larger states with significant total damages like Texas appear further down the list when the damage is adjusted per square mile, highlighting the importance of considering geographic scale in such analyses. The chart also informs us that less geographically extensive states with high population densities or substantial infrastructure—like Delaware and New Jersey—may experience high normalized damages, underscoring the potential for significant impact in smaller areas.
The left map illustrates the total number of weather-related events that occurred across the United States each month over the span of 72 years (1950-2022). The states are color-coded according to the volume of events, with darker shades indicating a higher frequency. This visual representation highlights regions that are more active during the first month of the year, which could be critical for understanding seasonal patterns and for preparing emergency management resources accordingly.
On the right, the map displays the normalized data of weather events, taking into account the land area of each state. By normalizing these figures, we gain a proportional insight into the intensity of weather events relative to state size. This normalization allows for an equitable comparison across states, ensuring that both large and small states can be accurately assessed for their weather event density.
Utilize the monthly tabs below to navigate through weather event data for different times of the year. This feature allows for a quick comparative analysis to identify monthly and seasonal trends in weather-related events across the nation.
Note that the ‘Total Events’ map may show significant activity in larger states, while the ‘Normalized Events’ map can reveal which states have a higher density of events per square mile. It’s crucial to consider both perspectives when assessing the impact and preparing for future weather conditions.
Network analysis is a family of methods for studying how the nodes of a network are connected, for example by finding the shortest paths between them. It applies to a multitude of domains, e.g., air travel, road traffic, and social media analytics. Betweenness centrality quantifies the number of times a node acts as a bridge along the shortest path between two other nodes. While there are many different measures used in network analysis, betweenness was found to be the best fit for this project.
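To make the measure concrete, here is a minimal sketch of betweenness centrality using Brandes' algorithm on an unweighted, undirected graph. It is not the implementation used in the project; it assumes the network is given as a symmetric adjacency dictionary in which every node appears as a key, and it ignores segment lengths.

```python
from collections import deque


def betweenness_centrality(adjacency, normalized=True):
    """Brandes' algorithm for an unweighted, undirected graph.

    adjacency: {node: iterable of neighbouring nodes}, symmetric.
    """
    scores = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from `source`.
        order = []
        predecessors = {node: [] for node in adjacency}
        paths = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        paths[source], distance[source] = 1, 0
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbour in adjacency[node]:
                if distance[neighbour] < 0:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
                if distance[neighbour] == distance[node] + 1:
                    paths[neighbour] += paths[node]
                    predecessors[neighbour].append(node)
        # Accumulate dependencies from the farthest nodes back to the source.
        dependency = {node: 0.0 for node in adjacency}
        for node in reversed(order):
            for pred in predecessors[node]:
                share = paths[pred] / paths[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != source:
                scores[node] += dependency[node]
    n = len(adjacency)
    # Each undirected pair is counted from both endpoints, so halve the totals.
    scale = 0.5
    if normalized and n > 2:
        scale /= (n - 1) * (n - 2) / 2
    return {node: score * scale for node, score in scores.items()}


# Toy network: a five-node line A-B-C-D-E, where C sits in the middle.
line = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
centrality = betweenness_centrality(line)
```

In the toy line, the middle node C lies on four of the six shortest paths between other node pairs, so it receives the highest normalized score (about 0.67), while the end nodes score zero.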
| States | Location |
|---|---|
| Alabama | Five Points South, Birmingham, AL, 35233 |
| Arizona | 701 W Harrison St, Phoenix, AZ, 85007 |
| Arkansas | Pine Bluff, AR, 71601 |
| California | Bakersfield, CA, 93305 |
| Colorado | North Denver, Denver, CO, 80202 |
| Connecticut | Old Saybrook, CT, 06475 |
| Delaware | 1570 Porter Rd Tunnel, Bear, DE, 19701 |
| Florida | Duval County, FL, 32234 |
| Georgia | Macon, GA, 31201 |
| Idaho | Pocatello, ID |
| Illinois | Dwight Township, IL, 60420 |
| Indiana | Wayne, Indianapolis, IN, 46234 |
| Iowa | Marshall County, IA, 50158 |
| Kansas | North Topeka West, Topeka, KS, 66608 |
| Kentucky | Lexington, KY, 40508 |
| Louisiana | Mid City North, LA, 70805 |
| Maine | 2 Newhall St, Fairfield, ME, 04937 |
| Maryland | Baltimore, MD |
| Massachusetts | Worcester, MA, 01604 |
| Michigan | Durand, MI, 48429 |
| Minnesota | Coon Rapids, MN, 55433 |
| Mississippi | Jackson, MS, 39203 |
| Missouri | Kansas City, MO, 64053 |
| Montana | Cascade County, MT, 59404 |
| Nebraska | Blaine, NE, 68901 |
| Nevada | Humboldt County, NV, 89445 |
| New Hampshire | Whitefield, NH, 03598 |
| New Jersey | Rahway City School District, Rahway, NJ |
| New Mexico | Valencia County, NM, 87002 |
| New York | Lakefront, Syracuse, NY, 13204 |
| North Carolina | Stanly County, NC, 28128 |
| North Dakota | Minot, ND, 58701 |
| Ohio | Columbus, OH 432155 |
| Oklahoma | McAlester, OK, 74501 |
| Oregon | Lloyd District, Portland, OR, 97232 |
| Pennsylvania | Rockville Bridge near Harrisburg, PA |
| Rhode Island | Warwick, RI, 02886 |
| South Carolina | Columbia, SC |
| South Dakota | Wolsey, SD, 57384 |
| Tennessee | Haywood County, TN, 38012 |
| Texas | Fort Worth, TX, 7610 |
| Utah | 400w 200 S, Salt Lake City, UT, 84101 |
| Vermont | Essex Junction, VT, 05452 |
| Virginia | Burkeville, VA, 23922 |
| Washington | 4, Auburn, WA, 98001 |
| West Virginia | Mason County, WV, 25550 |
| Wisconsin | Fox Crossing, WI, 54956 |
| Wyoming | Converse County, WY, 82633 |
After calculating the betweenness centrality for the entire continental US, we found that the node with the highest value, 0.2258, is located in Lima, Ohio. This is significant because the Lima railroad station served as a major hub in the early twentieth century, connecting five major railroads: the Pennsylvania Railroad; the Baltimore and Ohio Railroad; the New York, Chicago and St. Louis Railroad; the Erie Railroad; and the Detroit, Toledo and Ironton Railroad. By the 1990s, all passenger rail service through the station had been discontinued. The station has since been restored and now serves as both a museum and an office.
Based on the visualization, the top 5 nodes across the US are located in the same general area. Although only 3 points are visible, this simply means that some of the nodes lie very close to one another. The chart below gives a further breakdown by location.
| Rank | Nodeid | Centralityscore | Latitude | Longitude | Location |
|---|---|---|---|---|---|
| 1 | 435096 | 0.2258 | 40.74490 | -84.1042 | Lima, OH, 45801 |
| 2 | 435236 | 0.2236 | 40.74570 | -84.0882 | Lima, OH, 45804 |
| 3 | 416842 | 0.2225 | 41.56690 | -87.4178 | Calumet Township, IN |
| 4 | 429164 | 0.2223 | 41.07137 | -85.1283 | Hanna Creighton, Fort Wayne, IN |
| 5 | 429161 | 0.2221 | 41.07140 | -85.1302 | LaRez, Fort Wayne, IN |
The purpose of this side-by-side comparison of heatmaps is to show the difference between the railway mileage of each state and its density of rail lines. The heatmap on the left shows the total railway miles per US state. From our USDOT rail data, we extracted the state column and the miles column, which gives the length of each recorded railway segment. The total railway miles for a state were computed by summing all of its individual railway segments. Texas has the most railway mileage, followed by Illinois and California. For Texas and California, large land area and a high volume of exported goods are leading factors in the number of rail lines that run through these states. Illinois’s high concentration of rail lines can be explained by numerous factors, such as the state being the center of many of the nation’s rail networks and Chicago being the largest US rail gateway.
The heatmap on the right shows railway density per state, computed by dividing a state’s total railway miles by its area in square miles. States like California and Texas, which had high railway mileage, fall on the lower end for railway density. Some states, such as Florida, Virginia, and North Carolina, stay in a relatively similar range in both the mileage and density comparisons. There is a pattern of Midwestern and Northeastern states like Illinois, Ohio, and Pennsylvania having high railway densities, which can be historically attributed to their role as important hubs for the transportation of goods and as vital junctions for rail lines that run from East to West. New Jersey has the highest railway density, which reflects its status as the most densely populated state and its proximity to major cities like Philadelphia and New York City. The state is also home to substantial economic activity and a high volume of freight rail traffic.
| States | Rail.Miles | Density.of.Miles |
|---|---|---|
| Alabama | 4378.5426 | 8.35280e-02 |
| Arizona | 2677.6307 | 2.34900e-02 |
| Arkansas | 3393.5617 | 6.38140e-02 |
| California | 9449.7252 | 5.77270e-02 |
| Colorado | 3846.1349 | 3.69490e-02 |
| Connecticut | 723.3180 | 1.30482e-01 |
| Delaware | 351.8919 | 1.41395e-01 |
| Florida | 4121.1503 | 6.26720e-02 |
| Georgia | 5683.6332 | 9.56440e-02 |
| Idaho | 2657.7008 | 3.21590e-02 |
| Illinois | 9997.6332 | 1.72630e-01 |
| Indiana | 5778.9817 | 1.58678e-01 |
| Iowa | 5168.5291 | 9.18480e-02 |
| Kansas | 6344.9123 | 7.71150e-02 |
| Kentucky | 3583.7934 | 8.86910e-02 |
| Louisiana | 3716.2512 | 7.09500e-02 |
| Maine | 1750.0355 | 4.94640e-02 |
| Maryland | 1325.4027 | 1.06836e-01 |
| Massachusetts | 1481.2951 | 1.40349e-01 |
| Michigan | 5324.2994 | 5.50520e-02 |
| Minnesota | 6360.7271 | 7.31660e-02 |
| Mississippi | 3571.2015 | 7.37370e-02 |
| Missouri | 5674.2102 | 8.14010e-02 |
| Montana | 3917.0154 | 2.66390e-02 |
| Nebraska | 4622.2144 | 5.97590e-02 |
| Nevada | 1894.2960 | 1.71320e-02 |
| New Hampshire | 633.0250 | 6.77090e-02 |
| New Jersey | 2010.2172 | 2.30495e-01 |
| New Mexico | 2878.6835 | 2.36750e-02 |
| New York | 5113.2500 | 9.37270e-02 |
| North Carolina | 4431.9766 | 8.23490e-02 |
| North Dakota | 4255.0218 | 6.01760e-02 |
| Ohio | 7557.9691 | 1.68608e-01 |
| Oklahoma | 4181.5089 | 5.98220e-02 |
| Oregon | 3303.5101 | 3.35800e-02 |
| Pennsylvania | 7455.3994 | 1.61883e-01 |
| Rhode Island | 182.9365 | 1.18414e-01 |
| South Carolina | 2939.4058 | 9.17980e-02 |
| South Dakota | 2410.7400 | 3.12610e-02 |
| Tennessee | 3820.6410 | 9.06560e-02 |
| Texas | 14316.9965 | 5.33030e-02 |
| Utah | 2700.5354 | 3.18100e-02 |
| Vermont | 682.4734 | 7.09780e-02 |
| Virginia | 4309.6317 | 1.00751e-01 |
| Washington | 5452.2056 | 7.64710e-02 |
| West Virginia | 2840.0150 | 1.17210e-01 |
| Wisconsin | 4587.7059 | 7.00450e-02 |
| Wyoming | 2571.1431 | 7.00450e-02 |
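The mileage and density columns can be reproduced with a short calculation: sum the segment miles per state, then divide by the state's area. The sketch below uses segment lengths from the sample rail rows; the function name `rail_density` is hypothetical, and the state areas are approximate total areas in square miles supplied as an assumption, since the source of the area figures is not stated.

```python
from collections import defaultdict


def rail_density(segments, area_sq_miles):
    """Return {state: (total_miles, miles_per_square_mile)}.

    segments: iterable of (state_abbreviation, segment_miles) pairs.
    """
    totals = defaultdict(float)
    for state, miles in segments:
        totals[state] += miles
    return {
        state: (total, total / area_sq_miles[state])
        for state, total in totals.items()
        if state in area_sq_miles
    }


# Segment lengths copied from the sample rail rows. Areas are approximate
# total state areas in square miles and are an assumption of this sketch.
segments = [("ID", 0.2218197), ("ID", 0.1275709), ("ID", 0.3128218), ("MT", 0.8865854)]
areas = {"ID": 83569.0, "MT": 147040.0, "TX": 268596.0}
density = rail_density(segments, areas)

# Cross-check against the table: Texas total miles over the assumed area.
texas_density = 14316.9965 / areas["TX"]
```

Under the assumed Texas area, the ratio comes to roughly 0.0533 miles of rail per square mile, in line with the Texas row of the table.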
We propose that Bank of America take a cumulative look at the states that are most vulnerable to climate events, while also noting the states with the highest rail densities and the points of highest centrality at the state and national levels. Inferences can be made from our climate and rail analyses, which were mostly visualized separately, but further exploring the combined relationship can provide a more holistic understanding of the railway routes that are vulnerable to weather events. This can assist Bank of America in conducting a precise analysis of the severity of climate event damage to railway lines.
This would likely require a climate event analysis on a more magnified scope so that we can precisely pinpoint the rail lines that are affected the most from varying climate events. Finding a climate event dataset that has the longitude and latitude for all the climate events would help in carrying out this more magnified analysis of the weather events on a county or zip code basis. We were able to pinpoint locations of top rail centrality primarily because of the longitude and latitude attributes; therefore, having this for the weather data would be very beneficial in future mappings and overlapping the weather and rail data with the highest level of precision.
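One way such an overlay could work is to match each weather event to its closest rail node by great-circle distance. The sketch below is only an illustration of that idea, not part of the completed analysis: the function names are hypothetical, the node coordinates come from the top-centrality table, and the event coordinates are the BEGIN_LAT and BEGIN_LON of one of the sample tornado records.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude-longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearest_node(event_lat, event_lon, nodes):
    """Return (node_id, distance_miles) for the rail node closest to an event."""
    candidates = (
        (node_id, haversine_miles(event_lat, event_lon, lat, lon))
        for node_id, (lat, lon) in nodes.items()
    )
    return min(candidates, key=lambda pair: pair[1])


# Node coordinates from the top-centrality table.
nodes = {
    435096: (40.74490, -84.1042),  # Lima, OH
    416842: (41.56690, -87.4178),  # Calumet Township, IN
    429164: (41.07137, -85.1283),  # Fort Wayne, IN
}
# Event start point from a sample tornado record (Lehigh County, PA).
node_id, miles = nearest_node(40.58, -75.7, nodes)
```

In practice a distance threshold would be needed so that an event is only linked to a node when it is close enough to plausibly affect it; the right threshold depends on the event type and is left open here.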
This analysis would streamline the company’s economic evaluation of the regions most prone to rail vulnerabilities, which would aid Bank of America in its interactions with clients. Bank of America aims to provide insights to its clients regarding loan approvals, mortgages, and potential investments, particularly in areas where infrastructure is susceptible to damage.
There were a few limitations that we ran into when analyzing our data:
An important future step is to continue the project by mapping the probabilistic model against the rail lines. We built a probabilistic model from the climate data through our three-phase process, but we did not get to map its output against the rail network. Doing so would help indicate the areas in the US with the highest vulnerability under a predictive modeling approach.
Another specified type of analysis that would be worth noting is categorizing high-impact and low-impact weather events as two different forms of climate events. This would help with getting a more specific understanding of the level of impact that rail infrastructure would undergo. As of now, we have all weather events combined in one when assessing secondary impacts, but all impacts aren’t realistically gauged at the same level when there is a wide range of weather events that have varying impacts. Subdividing the events into these various categories can help with making more situation-specific conclusions as to the type of impact that the weather events would pose on rail infrastructure.
In terms of rail line analysis, analyzing rail lines that run along borders would be something to look into. When our team explored the top centrality node for each state, the neighboring states were taken out of the picture, so lines that span multiple states were cut short in the state-by-state analysis. Top centrality nodes could differ when taking other connective state rail lines into account.